Evaluation of CASA and BSS models for cocktail-party speech segregation
Abstract
For speech segregation, a blind source separation (BSS) model is tested together with a computational auditory scene analysis (CASA) model that is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated to 1 second for the simulations. To apply the two models, we split the frequency domain into a variable number of subbands, which are processed independently. We then evaluate the gain using reference signals recorded in isolation. A coherence index, which measures the degree of convergence and does not require this reference, is also established for the BSS model. After a careful analysis, we find gains of about 1-3 dB for the two methods, which are smaller than those published for the same task. Varying the number of subbands allows an optimisation: we obtain a significant peak at 4 subbands for the CASA model and a smaller maximum at 2 subbands for the BSS model.
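The abstract does not say how the TDOA is evaluated. As an illustration only, the sketch below estimates a delay between two stereo channels by cross-correlation, a common generic approach that may differ from the authors' method. The function name `estimate_tdoa`, its parameters, and the 1 ms search window are assumptions made for this example.

```python
import numpy as np


def estimate_tdoa(left, right, sample_rate, max_delay_s=1e-3):
    """Estimate the delay (in seconds) of `right` relative to `left`.

    A positive result means the sound reaches the right channel later.
    Generic cross-correlation sketch; not taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # Full cross-correlation: entry k holds sum_n right[n + k] * left[n].
    xcorr = np.correlate(right, left, mode="full")
    lags = np.arange(-(len(left) - 1), len(right))

    # Only search lags that are plausible for the assumed set-up.
    max_lag = int(round(max_delay_s * sample_rate))
    keep = np.abs(lags) <= max_lag

    best_lag = lags[keep][np.argmax(xcorr[keep])]
    return best_lag / sample_rate
```

In a subband scheme of the kind the abstract describes, such an estimate would presumably be computed separately on each band-filtered pair of channels; the filter bank itself is not shown here.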
Similar resources
Evaluation of CASA and BSS models for subband cocktail-party speech separation
For speech segregation, a recurrent blind separation model (BSS) is tested together with a CASA model, which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For applying the two models, we cut the fre...
Comparative evaluation of CASA and BSS models for subband cocktail-party speech separation
For speech segregation, a blind separation model (BSS) is tested together with a CASA model which is based on the localisation cue and the evaluation of the time delay of arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For applying the two models, we cut the frequency doma...
Comparative evaluation of CA for subband cocktail-party
For speech segregation, a recurrent blind separation model (BSS) is tested together with a Computational Auditory Scene Analysis (CASA) model, which is based on the localisation cue and the evaluation of the Time Delay Of Arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated at 1 second for the simulations. For...
A CASA Front-end Using the Localisation Cue for Segregation and Then Cocktail-party Speech Recognition
We propose and test a cocktail-party recognition technique based on segregation applied before recognition. This CASA front-end uses the TDOA (Time Delay Of Arrival) evaluated within subbands in order to determine the Relative Level (RL) of two competing speech sources. To perform the evaluation of the model, we have recorded a stereo database ST-NB95 from the mono Numbers95. This is composed o...
Cocktail Party Processing
Speech segregation, or the cocktail party problem, has proven to be extremely challenging. This presentation describes a computational auditory scene analysis (CASA) approach to the cocktail party problem. This approach performs auditory segmentation and grouping in a two-dimensional time-frequency representation that encodes proximity in frequency and time, periodicity, amplitude modulation, a...